A Performance Analysis of Fault Recovery in Stream Processing Frameworks

Authors

Abstract

Distributed stream processing frameworks have gained widespread adoption in the last decade because they abstract away the complexity of parallel processing. One of their key features is built-in fault tolerance. In this work, we dive deeper into the implementation, performance, and efficiency of this critical feature for four state-of-the-art frameworks. We include the established Spark Streaming and Flink frameworks and the more novel Spark Structured Streaming and Kafka Streams frameworks. We test the behavior under different types of faults and settings: master failure with and without high-availability setups, driver failures for the Spark frameworks, worker failure with or without exactly-once semantics, and application and task failures. We highlight differences in behavior during these failures on several aspects, e.g., whether there is an outage, downtime, recovery time, data loss, duplicate processing, accuracy, and the cost and behavior of different message delivery guarantees. Our results highlight the impact of framework design on the speed of fault recovery and explain how different use cases may benefit from different approaches. Due to their task-based scheduling approach, the Spark frameworks can recover within 30 seconds, in most cases without necessitating an application restart. Kafka Streams has only a few seconds of downtime but is slower at catching up on delays. Finally, Flink can offer end-to-end exactly-once semantics at a low cost but requires job restarts for most failures, leading to high recovery times of around 50 seconds.

Similar resources

A SWOT Analysis of the English Program of a Bilingual School in Iran

Given the status of English as an international language, and considering that governments and education officials around the world currently feel the need to give children the opportunity to learn English at an early age in bilingual schools, the present study uses the SWOT model (strengths, weaknesses, opportunities, and threats) to evaluate a bilingual school in Iran. To conduct this research in ...

Toward High-Performance Distributed Stream Processing via Approximate Fault Tolerance

Fault tolerance is critical for distributed stream processing systems, yet achieving error-free fault tolerance often incurs substantial performance overhead. We present AF-Stream, a distributed stream processing system that addresses the trade-off between performance and accuracy in fault tolerance. AF-Stream builds on a notion called approximate fault tolerance, whose idea is to mitigate back...

Heterogeneity-aware scheduler for stream processing frameworks

This article discusses problems and decisions related to scheduling of stream processing applications in heterogeneous clusters. An overview of the current state of the art of the stream processing on heterogeneous clusters with a focus on resource allocation and scheduling is presented first. Then, common scheduling approaches of various stream processing frameworks are discussed and their lim...

Fault Analysis of Stream Ciphers

A fault attack is a powerful cryptanalytic tool which can be applied to many types of cryptosystems which are not vulnerable to direct attacks. The research literature contains many examples of fault attacks on public key cryptosystems and block ciphers, but surprisingly we could not find any systematic study of the applicability of fault attacks to stream ciphers. Our goal in this paper is to ...

Fault Tolerance for Stream Processing Engines

Distributed Stream Processing Engines (DSPEs) target applications related to continuous computation, online machine learning and real-time query processing. DSPEs operate on high volumes of data by applying lightweight operations on real-time and continuous streams. Such systems require clusters of hundreds of machines for their deployment. Streaming applications come with various requirements, i...

Journal

Journal title: IEEE Access

Year: 2021

ISSN: 2169-3536

DOI: https://doi.org/10.1109/access.2021.3093208